Sub-Sampled Newton Methods I: Globally Convergent Algorithms

Authors

  • Farbod Roosta-Khorasani
  • Michael W. Mahoney
Abstract

Large-scale optimization problems are ubiquitous in machine learning and data analysis, and there is a plethora of algorithms for solving such problems. Many of these algorithms employ sub-sampling, as a way to speed up the computations and/or to implicitly implement a form of statistical regularization. In this paper, we consider second-order iterative optimization algorithms, i.e., those that use Hessian as well as gradient information, and we provide bounds on the convergence of variants of Newton’s method that incorporate uniform sub-sampling as a means to estimate the gradient and/or the Hessian. Our bounds are non-asymptotic, i.e., they hold for a finite number of data points in finite dimensions and for a finite number of iterations. In addition, they are quantitative and depend on quantities related to the problem, i.e., the condition number. However, our algorithms are global and are guaranteed to converge from any initial iterate. Using random matrix concentration inequalities, one can sub-sample the Hessian in a way that preserves the curvature information. Our first algorithm incorporates such a sub-sampled Hessian while using the full gradient. We also give additional convergence results for the case where the sub-sampled Hessian is regularized, either by modifying its spectrum or by ridge-type regularization. Next, in addition to Hessian sub-sampling, we consider sub-sampling the gradient as a way to further reduce the computational complexity per iteration. We use approximate matrix multiplication results from randomized numerical linear algebra (RandNLA) to obtain the proper sampling strategy. In all these algorithms, computing the update boils down to solving a large-scale linear system, which can be computationally expensive. As a remedy, for all of our algorithms, we also give global convergence results for the case of inexact updates, where such a linear system is solved only approximately. This paper has a more advanced companion paper [40] in which we demonstrate that, by doing a finer-grained analysis, we can get problem-independent bounds for the local convergence of these algorithms and explore tradeoffs to improve upon the basic results of the present paper.

∗International Computer Science Institute, Berkeley, CA 94704, and Department of Statistics, University of California at Berkeley, Berkeley, CA 94720. farbod/[email protected]

arXiv:1601.04737v3 [math.OC] 26 Feb 2016
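
The abstract only outlines the algorithmic template, so the following is a minimal, hypothetical Python sketch of one sub-sampled Newton iteration for a finite-sum objective F(x) = (1/n) Σ f_i(x). It assumes per-sample gradient and Hessian oracles (grad_i, hess_i), uniform sampling with replacement, ridge-type regularization of the sub-sampled Hessian, and a conjugate-gradient solve for the inexact update. All names and parameters here (subsampled_newton_step, s_H, s_g, reg, cg_tol) are illustrative and are not taken from the paper.

```python
import numpy as np

def subsampled_newton_step(x, grad_i, hess_i, n, s_H, s_g=None,
                           reg=1e-3, cg_tol=1e-4, cg_iters=50, rng=None):
    """One sub-sampled Newton step for F(x) = (1/n) * sum_i f_i(x).

    grad_i(i, x) / hess_i(i, x): gradient and Hessian of f_i at x.
    s_H, s_g: Hessian / gradient sample sizes (s_g=None uses the full gradient).
    reg: ridge-type regularization added to the sub-sampled Hessian.
    """
    rng = np.random.default_rng() if rng is None else rng
    d = x.size

    # Gradient: full, or uniformly sub-sampled over s_g indices.
    if s_g is None:
        g = sum(grad_i(i, x) for i in range(n)) / n
    else:
        idx_g = rng.integers(0, n, size=s_g)
        g = sum(grad_i(i, x) for i in idx_g) / s_g

    # Uniformly sub-sampled Hessian, with ridge-type regularization.
    idx_H = rng.integers(0, n, size=s_H)
    H = sum(hess_i(i, x) for i in idx_H) / s_H + reg * np.eye(d)

    # Inexact update: solve H p = -g approximately by conjugate gradients.
    p = np.zeros(d)
    r = -g - H @ p          # initial residual (p = 0, so r = -g)
    q = r.copy()
    for _ in range(cg_iters):
        if np.linalg.norm(r) <= cg_tol * np.linalg.norm(g):
            break
        Hq = H @ q
        alpha = (r @ r) / (q @ Hq)
        p += alpha * q
        r_new = r - alpha * Hq
        beta = (r_new @ r_new) / (r @ r)
        q = r_new + beta * q
        r = r_new

    # Unit step for brevity; a globally convergent variant would combine
    # this direction with a step-size rule (e.g., Armijo-type line search).
    return x + p
```

The unit step at the end is only for brevity; the globally convergent variants described in the abstract pair such a direction with a step-size rule, and the sample sizes s_H and s_g are exactly the quantities the paper's bounds are stated in terms of.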


Related articles

Globally Convergent Newton Algorithms for Blind Decorrelation

This paper presents novel Newton algorithms for the blind adaptive decorrelation of real and complex processes. They are globally convergent and exhibit an interesting relationship with the natural gradient algorithm for blind decorrelation and the Goodall learning rule. Indeed, we show that these two latter algorithms can be obtained from their Newton decorrelation versions when an exact matrix...


A Class of Globally Convergent Algorithms for Pseudomonotone Variational Inequalities

We describe a fairly broad class of algorithms for solving variational inequalities, whose global convergence is based on the strategy of generating a hyperplane separating the current iterate from the solution set. The methods are shown to converge under very mild assumptions. Specifically, the problem mapping is only assumed to be continuous and pseudomonotone with respect to at least one ...


Evolutionary Computing for Operating Point Analysis of Nonlinear Circuits

The DC operating point of an electronic circuit is conventionally found using the Newton-Raphson method. This method is not globally convergent and can only find one solution of the circuit at a time. In this paper, evolutionary computing methods, including Genetic Algorithms, Evolutionary Programming, Evolutionary Strategies and Differential Evolution, are explored as possible alternatives to Ne...


Nesterov's Acceleration For Approximate Newton

Optimization plays a key role in machine learning. Recently, stochastic second-order methods have attracted much attention due to their low computational cost in each iteration. However, these algorithms might perform poorly, especially if it is hard to approximate the Hessian well and efficiently. As far as we know, there is no effective way to handle this problem. In this paper, we resort to N...



Journal:
  • CoRR

Volume abs/1601.04737, Issue: -

Pages: -

Publication year: 2016